AI-master.dev

LLM & Models 🟢 Beginner 4 min

ICML 2026 Seoul: 6,500+ papers accepted, ML enters the agentic era — key takeaways

Explore AI trends at ICML 2026 Seoul: over 6,500 accepted papers and the agentic era in machine learning.

2026-07-04 16:00

LLM & Models 🟢 Beginner 12 min

Claude Sonnet 5: Anthropic's most agentic model, Opus performance at Sonnet price

2026-07-01 15:02

LLM & Models 🟢 Beginner 12 min

OpenAI GPT-5.6 (Sol, Terra and Luna): the model family that changes everything

Discover OpenAI GPT-5.6: Sol, Terra and Luna, the revolutionary model family under direct government control from June 26, 2026.

2026-06-29 15:03

LLM & Models 🟢 Beginner 15 min

GPT-5.6 Sol: OpenAI launches the preview of a new model amid the early price war

Discover GPT-5.6 Sol, OpenAI's new preview shaking up the AI market amid a price war. Analysis and stakes of this launch.

2026-06-28 15:06

LLM & Models 🟢 Beginner 12 min

Poolside Laguna M.1: the 225B open-source model for the coding agent, Apache 2.0

Discover Poolside Laguna M.1, a 225B-parameter open-source model under Apache 2.0, built to revolutionize coding agents.

2026-06-27 18:06

LLM & Models 🟢 Beginner 15 min

FrontierCode: Cognition's benchmark that buries SWE-Bench and ranks code agents by the real quality of pull requests — Fable 5 at 46.3%, Opus 4.8 at 34.3%, GPT-5.5 at 25.5%

Discover FrontierCode, Cognition's new benchmark replacing SWE-Bench by evaluating the real quality of code agents' pull requests.

2026-06-26 17:03

LLM & Models 🟢 Beginner 15 min

DeepSWE: the benchmark proving that code agents were cheating — Artificial Analysis buries SWE-Bench

Discover DeepSWE, the new benchmark replacing SWE-Bench, proving code agents were cheating. Analysis of the rankings upended by Artificial Anal

2026-06-22 16:02

LLM & Models 🟢 Beginner 16 min

Gemini 3.5 Pro countdown: 10 days before Google's deadline, 2 million tokens and Deep Think mode, the most anticipated model of the year (amid talent chaos)

Gemini 3.5 Pro: 10 days before Google's deadline, discover the rumors about its 2 million tokens and Deep Think mode amid talent chaos.

2026-06-20 17:05

LLM & Models 🟢 Beginner 17 min

GLM-5.2: The most powerful open weights model in the world — 753B MoE, 1M context, MIT license, the LLM landscape shifts

Discover GLM-5.2 from Z.ai: the world's most powerful open weights model. 753B MoE, 1M context & MIT license shaking up the LLM landscape.

2026-06-18 15:02

LLM & Models 🟢 Beginner 13 min

CacheRL: A Qwen3-4B model achieves 92% accuracy in tool-calling with 100 times less compute than GPT-5

Discover CacheRL: a Qwen3-4B model hits 92% tool-calling accuracy with 100x less compute than GPT-5. AI revolution!

2026-06-16 17:02

LLM & Models 🟢 Beginner 11 min

Best LLMs for Code (June 2026)

Discover the ultimate comparison of the best coding LLMs in June 2026. Analysis of agentic models capable of coding without human supervision.

2026-06-16 03:01

LLM & Models 🟢 Beginner 13 min

Best Local LLMs (June 2026)

Discover the final ranking of the best local LLMs in June 2026. DeepSeek V4 Pro, Ollama: compare quality and privacy.

2026-06-15 03:02

LLM & Models 🟢 Beginner 13 min

Kimi K2.7-Code: the 1T-parameter open-source coding model that cuts 30% of reasoning tokens and beats Opus in tool use

Discover Kimi K2.7-Code, a 1T-parameter open-source coding model cutting reasoning tokens by 30% and outperforming Opus in tool use.

2026-06-14 17:01

LLM & Models 🟢 Beginner 15 min

DeepSeek V4-Pro: the permanent 75% price drop accelerating the LLM war

DeepSeek V4-Pro permanently drops its price by 75%. Discover how this LLM disrupts the market and accelerates the AI war.

2026-06-12 18:01

LLM & Models 🟢 Beginner 13 min

Qwen3 Coder Next: the open-source model that runs on a 64 GB Mac and beats DeepSeek in coding

Discover Qwen3 Coder Next, the open-source model running on a 64GB Mac and beating DeepSeek at coding. A revolution for local code!

2026-06-12 16:02

LLM & Models 🟢 Beginner 16 min

DiffusionGemma: Google releases the first open-source diffusion text model, 4x faster than autoregressive

Discover DiffusionGemma: Google's first open-source diffusion text model, 4x faster than classic autoregressive approaches.

2026-06-11 15:04

LLM & Models 🟢 Beginner 11 min

Best LLMs (June 2026)

Discover the full June 2026 best LLM ranking after the GPT-5.5 release. Compare autonomous AI models and their reasoning.

2026-06-11 03:01

LLM & Models 🟢 Beginner 13 min

Claude Fable 5: Anthropic makes its Mythos model accessible to the public

Anthropic launches Claude Fable 5, the first public version of its Mythos model. Discover this model deemed too powerful and its explosive scores.

2026-06-10 15:02

LLM & Models 🟢 Beginner 12 min

Best Free LLMs (June 2026)

Discover the ranking of the best free LLMs in June 2026. Market analysis and comparison of uncensored AI models.

2026-06-09 04:01

LLM & Models 🟢 Beginner 12 min

DeepSeek's DeepEP: the open-source library that optimizes GPU communication for large-scale MoE models

DeepSeek releases DeepEP, an open-source library that optimizes GPU communication to accelerate large-scale MoE model training.

2026-06-05 19:01

LLM & Models 🟢 Beginner 14 min

NVIDIA Nemotron 3 Ultra 550B: The most powerful open-source model in the US arrives at Computex

Discover NVIDIA Nemotron 3 Ultra 550B, the most powerful US open-source model unveiled at Computex 2026 to rival China.

2026-06-04 18:01

LLM & Models 🟢 Beginner 13 min

MiniMax M3: the Chinese open-weights model defying GPT-5.5 with 1M context and MSA architecture

Discover MiniMax M3, the Chinese open-weights model challenging GPT-5.5. It offers 1 million context tokens via MSA architecture.

2026-06-04 16:01

LLM & Models 🟢 Beginner 13 min

DeepSeek V3.1: the silent revolution of open source arrives under the MIT license

DeepSeek V3.1 disrupts open source AI with a 671B parameter model under MIT license, with zero commercial restrictions.

2026-06-02 19:01

LLM & Models 🟢 Beginner 13 min

Claude Opus 4.8: the model that dethrones GPT-5.5 — benchmarks, Dynamic Workflows, and the future of the coding agent

Anthropic's Claude Opus 4.8 dethrones GPT-5.5. Discover its benchmarks, the Dynamic Workflows system, and the coding agent revolution.

2026-05-31 17:01

LLM & Models 🟢 Beginner 14 min

GPIC: Stanford releases 28 trillion pixels to train image generation models

Stanford releases GPIC, a 28-trillion-pixel dataset for training image generation models. Discover this permissive dataset.

2026-05-30 17:01

LLM & Models 🟢 Beginner 14 min

LLMSurgeon: this ACL 2026 paper opens the black box of LLM pre-training

Discover LLMSurgeon, the ACL 2026 paper that opens the LLM pre-training black box to reveal their secret data mix.

2026-05-29 18:01

LLM & Models 🟢 Beginner 13 min

Qwen3-Coder-Next: 80B MoE with 3B active, the open-source code agent that rivals Claude Sonnet

Discover Qwen3-Coder-Next: an 80B MoE (3B active) open-source code model rivaling Claude Sonnet on SWE-Bench.

2026-05-28 16:01

LLM & Models 🟢 Beginner 15 min

OSCAR: Together AI open-sources a 2-bit KV cache quantization that reduces memory by 8x

Discover OSCAR: Together AI's open-source 2-bit KV cache quantization that cuts memory by 8x and optimizes LLM serving.

2026-05-26 15:03

LLM & Models 🟢 Beginner 14 min

Stanford AI Index 2026: the 5 figures that show AI has passed a point of no return

Discover the Stanford AI Index 2026 and 5 key figures proving AI has crossed a point of no return.

2026-05-24 18:02

LLM & Models 🟢 Beginner 14 min

Gated DeltaNet-2: the Yejin Choi paper that solves the oldest problem of linear attention

Discover Gated DeltaNet-2, Yejin Choi's paper that finally solves the oldest problem of linear attention in AI models.

2026-05-23 17:03

LLM & Models 🟢 Beginner 12 min

Cursor Composer 2.5: The coding model that rivals Opus 4.7 at a tenth of the price

Discover Cursor Composer 2.5, a coding model rivaling Claude Opus 4.7 at a tenth of the price. AI price war analysis.

2026-05-22 16:04

LLM & Models 🟢 Beginner 15 min

DeepWeb-Bench: The new benchmark that exposes the weaknesses of AI search agents

Discover DeepWeb-Bench, the new benchmark proving AI search agent scores are inflated and exposing their true weaknesses.

2026-05-21 18:01

LLM & Models 🟢 Beginner 11 min

Gemini 3.5 Flash: the fast model that beats Opus 4.7 and GPT-5.5 on agent benchmarks at 289 tokens/second

Discover Gemini 3.5 Flash: the ultra-fast model at 289 tokens/sec beating Claude Opus 4.7 and GPT-5.5 on agent benchmarks.

2026-05-20 14:09

LLM & Models 🟢 Beginner 14 min

General Preference RL: this paper unifies reinforcement learning and preference optimization for LLMs

Discover the General Preference RL paper unifying reinforcement learning and preference optimization to solve LLM post-training.

2026-05-19 18:01

LLM & Models 🟢 Beginner 12 min

OpenAI Parameter Golf: The challenge that proves small models are the future of AI

Discover the OpenAI Parameter Golf challenge: why compressing an LLM into 16 MB proves small models are the future of AI.

2026-05-18 17:02

LLM & Models 🟢 Beginner 15 min

Meta Muse Spark: why Meta betrayed open-source — the first closed model from the Superintelligence Lab

Discover why Meta Muse Spark is a turning point: the first closed model from the Superintelligence Lab that betrays Meta's open-source promise.

2026-05-18 15:04

LLM & Models 🟢 Beginner 14 min

MeMo (Memory as a Model): memory as an autonomous model for updating LLMs without retraining

Discover MeMo (Memory as a Model): the innovative solution to update LLMs without retraining and defeat knowledge obsolescence.

2026-05-16 19:01

LLM & Models 🟢 Beginner 15 min

SDAR (Self-Distillation Agentic Reinforcement): how to train AI agents with reinforcement learning without breaking them

Discover SDAR (Self-Distillation Agentic Reinforcement): the method to train your AI agents with reinforcement learning without breaking them.

2026-05-16 18:02

LLM & Models 🟢 Beginner 13 min

OpenDeepThink: Bradley-Terry comparison-based parallel reasoning changes the game for LLM inference

Discover OpenDeepThink: how Bradley-Terry comparison parallel reasoning revolutionizes LLM inference and outperforms sequential chain-of-thought

2026-05-15 17:05

LLM & Models 🟢 Beginner 13 min

Negation Neglect: when fine-tuning makes LLMs blind to falsehoods

Discover the Negation Neglect phenomenon: how fine-tuning LLMs against fake news ends up making them blind to falsehoods.

2026-05-14 19:01

LLM & Models 🟢 Beginner 16 min

KV-Fold: the training-free trick that revolutionizes long-context inference in LLMs

Discover KV-Fold, the training-free trick revolutionizing LLM long-context inference and solving the token management nightmare.

2026-05-13 18:06

LLM & Models 🟢 Beginner 16 min

Attractor Models: the new architecture that beats Transformers at reasoning

Discover Attractor Models, the new AI architecture that outperforms Transformers on reasoning at equivalent parameters.

2026-05-13 17:06

LLM & Models 🟢 Beginner 12 min

UniPool: the newcomer in MoE architectures decouples network depth from expert growth

Discover UniPool, the innovation revolutionizing MoE architectures by decoupling network depth from expert growth.

2026-05-10 15:21

LLM & Models 🟢 Beginner 10 min

Best Free LLMs (May 2026)

Discover the best free LLMs of May 2026. Our comparison helps you find the ideal open-source or freemium AI without paying.

2026-05-09 15:11

LLM & Models 🟢 Beginner 13 min

VaultGemma: Google DeepMind releases the world's most powerful differentially private LLM

Discover VaultGemma, the world's most powerful differentially private LLM by Google DeepMind. Mathematical guarantees for your data.

2026-05-09 15:00

LLM & Models 🟢 Beginner 15 min

Subquadratic exits stealth with SubQ: 12 million context tokens, the end of quadratic attention?

Subquadratic unveils SubQ: a revolutionary AI model handling 12M context tokens and ending quadratic attention.

2026-05-09 05:37

LLM & Models 🟢 Beginner 16 min

Tokens, context, costs: understanding LLM billing

Understand LLM billing: tokens, context window, cost calculation & 2026 price comparison chart. 12 tips to cut your expenses.

2026-02-24 10:26

LLM & Models 🟢 Beginner 12 min

Claude, GPT, Gemini, Llama: Which Model to Choose in 2026?

Choosing a language model (LLM) in 2026 is a bit like choosing a car: there’s no universal "best"—only the best for you. Between Anthropic’s Claude, OpenAI’s...

2026-02-24 09:51

LLM & Models 🟢 Beginner 14 min

SigLoMa: a quadruped robot that learns manipulation in the real world using vision alone

Meet SigLoMa, a revolutionary quadruped robot that learns real-world manipulation tasks using vision alone. Explore the future of robotics.

2026-05-06 18:36

LLM & Models 🟢 Beginner 13 min

Qwen3.6: Alibaba arrives with a new family of LLMs

Discover Qwen3.6, Alibaba's new LLM family. With its MoT architecture (35B-A3B), rival GPT-4 at a lower cost. Deployment guide inc

2026-05-05 22:03
